BIST-Based Delay-Fault Testing in FPGAs



Miron Abramovici 
Circuits and Systems Research Lab 
Agere Systems, Murray Hill, NJ 
miron@agere.com 

Abstract: We present the first delay-fault testing 
approach for FPGAs, applicable both for manufacturing and 
for on-line testing. Our approach is based on BIST, is com- 
prehensive, and does not require expensive ATE. We have 
successfully implemented this BIST approach on the ORCA 
2C series FPGA. 

1. Introduction 

Advances in VLSI technology have resulted in more 
defects affecting the delays in the circuit, thus increasing the 
importance of delay-fault testing. The current FPGA manu- 
facturing testing practice relies on configuring the FPGA 
with many designs and running them at speed. This is useful 
for speed-binning, but it is not a comprehensive delay-fault 
test. One of the main difficulties of the problem of delay-fault 
testing for FPGAs originates in the fact that users' circuits are
not known when the chip is manufactured. While for an ASIC 
the target clock frequency is known before the chip is fabri- 
cated, it is impossible for the FPGA manufacturer to test the 
circuits that will be implemented on each device. Previous 
work in FPGA delay-fault testing [7][8] considered only the 
testing of the user's circuit. However, such an approach is not
acceptable either to the FPGA manufacturers, who want to 
ship defect-free chips, or to users, who want to purchase 
defect-free parts; otherwise, system-level debugging 
becomes a nightmare if design problems are intermixed with 
manufacturing faults. 

In testing FPGAs, we can take advantage of their unique 
properties - reconfigurability, partial reconfigurability, and
regular structure - to achieve features that cannot be realized 
in ASIC testing. Unlike in ASICs, where BIST requires large 
area overhead and some performance degradation, FPGA 
BIST can be done with zero area overhead and no delay pen- 
alty [10]. While exhaustive testing is practically impossible 
for ASICs, in FPGAs we can exhaustively test both the pro- 
grammable logic blocks (PLBs) and the programmable 
interconnect network, both for off-line [4][12] and on-line 
testing [3][11] with reasonable test application times.
Exhaustive testing guarantees maximum fault coverage and 
removes the need to evaluate the test quality. Thus FPGA test 
does not require either ATPG or fault simulation, which are 
computationally expensive tasks for ASICs. Accurate fault 
diagnosis is difficult to achieve in ASICs, but it can be done 
very efficiently in FPGAs [4][3][11]. In ASICs, fault toler- 
ance for manufacturing faults cannot be accomplished 
without massive redundancy, but FPGAs allow efficient 
fault-tolerance without dedicated spare resources [5][6]. 

Our first attempt to do FPGA delay-fault testing was to 
run our BIST configurations at the specified clock rate for the 
target FPGA. However, we quickly realized that this cannot
work unless we significantly increased the number of
configurations. The number of configurations is the dominant factor
determining the total test time, because the configuration 
time is several orders of magnitude larger than the pattern- 
application time. To limit the number of configurations, our 
BIST techniques try to test as many resources as possible 
within the same configuration. For off-line PLB test, this 
requires distributing the patterns from the test-pattern gener- 
ator (TPG) to many PLBs under test using long signal paths. 
Similarly, for interconnect test, the wires under test are long, 
connecting many wire segments and switches. Under these 
conditions, we must run the BIST configurations at a fre- 
quency much lower than that used in normal operation, so 
most delay faults will not be detected by these tests. An alter- 
native would be to reduce the amount of logic and/or 
interconnect that is under test during any BIST configuration 
to allow BIST execution at a higher clock frequency; how- 
ever, this will require more configurations, and ultimately 
significant increases in testing time and cost. 

In this paper, we introduce a novel technique for delay-fault
testing in FPGAs. Our technique is independent of the
applications implemented on the FPGA, and it is applicable 
for both off-line manufacturing testing and for on-line testing 
(the latter in the context of the Roving STARs approach [3]). 
Our method is based on BIST, it is comprehensive, and it can 
work with any low-cost ATE. For our implementation, we 
used the Lattice ORCA 2C series FPGA [16], but our tech- 
nique can also be applied to most other FPGA architectures, 
such as the Xilinx Virtex [17].

The remainder of this paper is organized as follows. In 
Section 2 we present the principle of the BIST technique, and 
in Section 3 we analyze its implementation issues. In 
Section 4 we discuss the application of the delay-fault BIST 
approach to on-line and off-line testing. Finally, we present 
conclusions in Section 5. 

2. The Main Idea
An FPGA is composed of PLBs, programmable I/O cells, 
and programmable interconnect; the latter consists of wire 
segments that can be connected via programmable switches 
referred to as configurable interconnect points (CIPs). Wire 
segments in the programmable interconnect network are 
bounded by these CIPs and are considered to be either global 
or local routing resources. Global routing resources connect 
non-adjacent PLBs, while local routing resources connect a 
PLB to global routing resources or to adjacent PLBs. The 
routing resources are bus-oriented, with the number of wires 
per bus typically ranging between 4 and 8. The PLB functions 
and the CIPs are controlled by writing the configuration 
RAM.

The basic CIP structure consists of a transmission gate
controlled by a configuration memory bit (Figure 1a). There
are three types of CIPs which we refer to as the cross-point
CIP (Figure 1b), the break-point CIP (Figure 1c), and the
multiplexer (MUX) CIP (Figure 1d) [16]. While a cross-point
CIP connects wire segments located in disjoint planes (a hor- 
izontal segment with a vertical one), a break-point CIP 
connects two wire segments in the same plane. The MUX CIP 
comes in two varieties: decoded and non-decoded. A decoded 
MUX CIP is a group of 2^k cross-point CIPs sharing a common
output wire and controlled by k configuration bits, such
that the input wire being addressed by the configuration bits 
is connected to the output wire; the decoding logic is incorpo- 
rated between the configuration bits and the transmission 
gates. A non-decoded MUX CIP contains a configuration bit 
for each transmission gate, such that k wire segments are con- 
trolled by k configuration bits; usually only one of the 
configuration bits is active for any configuration. There is also 
a compound CIP (Figure 1e), which is a combination of four
cross-point and two break-point CIPs, each separately con- 
trolled by a configuration bit [17]. Most recent FPGA 
interconnect architectures are primarily constructed from 
non-decoded MUX CIPs that are buffered to prevent signal 
degradation due to the series resistance of each transmission 
gate the signal passes through.

Figure 1. Configurable Interconnect Points

A signal path is formed by connecting several wire segments
and PLBs in a continuous sequence via multiple CIPs.
The propagation delay along the path accumulates the delays
of all its PLBs, segments, and CIPs. A path may have different
delays for rising (0/1) and falling (1/0) transitions.

Figure 2a illustrates our BIST architecture. We configure
several paths under test (PUTs), so that every path has the
same sequence of PLBs, wire segments, and CIPs. Each PLB
on the path is programmed as an identity function, so it
appears as a buffer for the signal propagating along the path.
The PUTs are identical, except for their position in the FPGA.
This works well with the bus structure of the programmable
interconnect of most FPGAs (with 4 to 8 wires per bus).

Our technique compares the delays of the PUTs. Assume
that a rising transition is applied at their common input A. This
transition propagates along every PUT, and it will eventually
appear at the inputs of the OR and the NAND gates. The signal
FIRST responds to the fastest arriving transition, while
LAST changes only after the slowest one has arrived. FIRST
enables the local oscillator loop, and LAST stops the
oscillations (see Figure 2b). Thus the count of OSC pulses measures
the difference D between the fastest and the slowest propagation
delays along the PUTs. In a circuit free of delay faults, D
should be smaller than a predetermined threshold; otherwise
we say that a delay fault is detected. Note that the same circuit
can detect a delay fault affecting the propagation of a 1/0
transition, the only difference being that the roles of FIRST and
LAST are reversed.

Since the first OSC pulse may be generated (possibly as a
partial pulse) even when the transitions of FIRST and LAST
are very close, a count of one should not be interpreted as
indicating a delay fault.

Figure 2. The basic principle



Figure 3. Typical PLB structure

Figure 4. PUT in PLB

Figure 3 illustrates the typical structure of a PLB,
consisting of look-up tables (LUTs), flip-flops (FFs) that can
also be configured as latches, and output MUX logic.
Figure 4 shows the structure of a PUT traversing both a LUT
and a FF inside a PLB. The rising input
transition is applied to all LUT inputs, and the LUT is configured
as an AND gate, whose output propagates the slowest of
its input transitions. This allows concurrent testing of all
paths through the LUT. The FF/latch is configured as a latch
and its clock input is kept at the active value, so that the latch
will be open and will behave like a buffer. In this way the
entire PLB implements an identity function. The paths
bypassing the FFs and the paths bypassing the LUTs are
tested by similar configurations. When propagating a falling
transition, the LUT should implement an OR gate.
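
The choice of LUT function can be checked with a small event-driven model. The sketch below is only an illustration under simplifying assumptions (ideal zero-delay gates; the function name `output_switch_time` and the arrival times are hypothetical): it applies the same transition to every LUT input at a different arrival time and reports when the gate output changes.

```python
def output_switch_time(gate, arrivals_ps, rising):
    """Return the time at which the gate output first changes.

    Every input carries the same transition (rising or falling), arriving
    at the time given in arrivals_ps. Gates are ideal (zero delay); this is
    an illustrative assumption, not a model of a real LUT.
    """
    before, after = (0, 1) if rising else (1, 0)
    values = [before] * len(arrivals_ps)
    initial = gate(values)
    for t, idx in sorted((t, i) for i, t in enumerate(arrivals_ps)):
        values[idx] = after
        if gate(values) != initial:
            return t
    return None  # the output never changes


def and_gate(bits):
    return int(all(bits))


def or_gate(bits):
    return int(any(bits))


# Hypothetical arrival times (ps) at the four LUT inputs; input 3 is slow.
arrivals = [100, 130, 110, 250]

slowest_rising = output_switch_time(and_gate, arrivals, rising=True)
slowest_falling = output_switch_time(or_gate, arrivals, rising=False)
masked_rising = output_switch_time(or_gate, arrivals, rising=True)
```

With an AND gate the rising output waits for the slowest input, and with an OR gate the falling output does the same; swapping the two functions would let the fastest input determine the output and hide the slow one.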

It is surprising to observe that, unlike ASIC delay-fault
testing, our technique does not involve clocking. This is not a
problem, since delays on the clock distribution paths are
implicitly checked during speed-binning tests. Thus our
delay-fault BIST should be done in addition to, and not as a
replacement for, speed-binning.

The BIST circuitry is very simple: the TPG only needs to 
generate the two transitions, and the output response analyzer 
(ORA) consists of the three gates that produce the oscillation 
and the counter. The counter is reset before each experiment. 
Both the TPG and the counter are controlled, and the counter 
is read, via the FPGA boundary-scan access mechanism. 
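
The comparison principle of this section can be summarized in a few lines of code. The following is a minimal behavioral sketch, not the implemented circuit: it assumes ideal gates, assumes that any nonzero FIRST-to-LAST window produces at least one (possibly partial) OSC pulse, and uses hypothetical names and delay values.

```python
def osc_pulse_count(put_delays_ps, osc_period_ps):
    """Idealized OSC pulse count for one transition applied to all PUTs.

    FIRST follows the fastest PUT and enables the oscillator; LAST follows
    the slowest PUT and stops it. A partial pulse is counted as a pulse
    (modeling assumption), hence the ceiling division.
    """
    first_ps = min(put_delays_ps)
    last_ps = max(put_delays_ps)
    window_ps = last_ps - first_ps          # the difference D
    return -(-window_ps // osc_period_ps)   # ceil(D / period), 0 if D == 0


def delay_fault_detected(put_delays_ps, osc_period_ps, threshold_count=1):
    """A count of one is not treated as a fault, so the default threshold is 1."""
    return osc_pulse_count(put_delays_ps, osc_period_ps) > threshold_count


OSC_PERIOD_PS = 4000  # 250 MHz oscillator

# Hypothetical delays (ps) of four PUTs on one bus.
good_bus = [80000, 80500, 81000, 80200]
bad_bus = [80000, 80500, 92500, 80200]   # third PUT is 12.5 ns slower

good_count = osc_pulse_count(good_bus, OSC_PERIOD_PS)
bad_count = osc_pulse_count(bad_bus, OSC_PERIOD_PS)
```

In this model the fault-free bus yields a count of one, which is ignored, while the bus with one slow PUT yields a count above the threshold.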
3. Implementation Issues
The smallest difference between the delay of the fastest 
and slowest PUTs detectable with our scheme corresponds to 
one OSC cycle. When testing a path with ASIC-type delay-fault
testing, the smallest detectable delay fault is generally
about 5% of the path delay. To achieve a similar feature, 
PUTs should be constructed so that their delay corresponds to 
at least 20 OSC cycles. 
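
The figure of 20 OSC cycles follows directly from the target resolution: if one OSC cycle must be no more than 5% of the path delay, the path must span at least 1/0.05 = 20 cycles. A minimal sketch of this arithmetic is given below; the function name is hypothetical, and the 4 ns period corresponds to the 250 MHz oscillator frequency reported later in this section.

```python
import math
from fractions import Fraction


def min_put_delay_cycles(resolution):
    """Smallest PUT delay, in OSC cycles, for which one OSC cycle
    is at most `resolution` (a fraction of the path delay)."""
    return math.ceil(1 / Fraction(resolution))


cycles = min_put_delay_cycles(Fraction(5, 100))   # 5% resolution
min_delay_ps = cycles * 4000                      # 4 ns period at 250 MHz
```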

While making PUTs as long as possible would increase 
the number of FPGA resources concurrently tested, thus pos- 
sibly reducing the number of configurations required for a 
complete test, it may also cause false fault indications. For
example, assume a path P1 where all of its components
(PLBs, CIPs, and wire segments) are just 1% slower than
their counterparts on path P2. If the PUTs involve a large
number of components, the accumulated difference between
the delays of P1 and P2 may be reported as a delay fault.
Therefore PUTs should be constructed so that their delay is
not significantly larger than that of an average "normal" path
that would be used in circuits implemented in the FPGA.
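
The effect of accumulated mismatch can be illustrated numerically. The sketch below is an assumption-laden example (uniform 1% slowdown, hypothetical path delays, 4 ns OSC period) that expresses the delay difference between P1 and P2 in OSC cycles.

```python
from fractions import Fraction


def skew_in_osc_cycles(path_delay_ps, slowdown_percent, osc_period_ps):
    """Delay difference, in OSC cycles, between a path and a copy of it
    whose components are all slower by slowdown_percent."""
    skew_ps = path_delay_ps * slowdown_percent // 100
    return Fraction(skew_ps, osc_period_ps)


# A PUT of about 20 OSC cycles (80 ns): 1% mismatch stays below one cycle.
short_put = skew_in_osc_cycles(80_000, 1, 4000)

# A very long PUT (2 us, hypothetical): the same 1% mismatch spans 5 cycles.
long_put = skew_in_osc_cycles(2_000_000, 1, 4000)
```

Under these assumptions the short PUT produces a difference of one fifth of a cycle, which is indistinguishable from normal variation, while the long PUT produces a five-cycle difference that would be flagged even though no component is defective.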

In any comparison-based BIST, a passing result may be
produced when the compared elements are all faulty; in our
case, this means that all the compared PUTs are equally slow.
Such a situation is unlikely when we compare several (4 to 8)
paths. However, a validation test to protect against this case
can be easily done by selecting one of the paths that passed
the test and comparing it with a new path which was not part
of the compared group.

No delay faults will be detected in a slow device where all
paths are equally slow. This is the correct result, and such a
chip will be identified by speed-binning and may be allowed
to work as a lower speed-grade.

Our approach may fail if a PUT has compensating delay 
faults, that is, detecting a slow path segment is masked by the 
presence of a fast segment, so that the overall path delay 
remains about the same as the other PUTs. In general, how- 
ever, most delay faults slow down the circuit, and such a 
multiple fault is unlikely to occur in practice. 

The use of the local oscillator created from the inverting
feedback in the PLB logic could give rise to concerns about the
quality of the clock to the counter of the ORA, specifically
the duty cycle and period needed for proper operation of the
counter logic. One solution to this problem is to configure a
single flip-flop as a toggle flip-flop with the output of the
local oscillator driving the clock input to this flip-flop. This
effectively divides the local oscillator frequency by 2 and
ensures a near 50% duty cycle. The lower frequency clock
will only reduce the resolution of delay fault detection as
opposed to preventing this delay fault BIST from working.
However, we have implemented the delay fault BIST
approach in an ORCA 2C15A FPGA from Lattice Semiconductor
and found the oscillator clock to run at 250 MHz
while producing a duty cycle and clock waveform of sufficient
quality to obtain reproducible results from one execution
of the delay fault BIST sequence to the next. Therefore,
dividing the clock may not be necessary.
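
The divide-by-2 option can be illustrated with a small behavioral model. This is a sketch under idealized assumptions (a flip-flop that toggles instantly on each rising clock edge; hypothetical function names): because only the rising edges of the oscillator matter, the divided clock has a 50% duty cycle regardless of the duty cycle of the oscillator itself.

```python
from fractions import Fraction


def toggle_ff(rising_edges_ps):
    """Output changes (time, level) of a toggle flip-flop clocked by the
    given rising edges; the output starts at 0."""
    level = 0
    changes = []
    for t in rising_edges_ps:
        level ^= 1
        changes.append((t, level))
    return changes


def duty_cycle(changes):
    """Duty cycle over the first full output period (rise, fall, rise)."""
    (t_rise, _), (t_fall, _), (t_next, _) = changes[:3]
    return Fraction(t_fall - t_rise, t_next - t_rise)


# Oscillator with a 4 ns period; its own duty cycle is irrelevant here
# because the flip-flop responds only to rising edges.
osc_rising_edges = [0, 4000, 8000, 12000]
divided = toggle_ff(osc_rising_edges)
divided_period_ps = divided[2][0] - divided[0][0]
```

The divided clock has twice the period, so each count corresponds to 8 ns instead of 4 ns, which is the loss of resolution mentioned above.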
4. Application to Off-Line and On-Line Testing 
Our roving STARs approach [1][3] introduced new techniques
for on-line FPGA testing, diagnosis, and fault
tolerance, applicable to any FPGA supporting incremental
run-time reconfiguration (RTR) via its boundary-scan interface
[15]. A STAR (self-testing area) is a temporarily off-line
section of the FPGA where self-testing occurs without dis- 
turbing the normal system activity in the rest of the chip. 
Roving the STARs periodically brings every section of the 
FPGA under test. Our approach guarantees complete testing 
of the FPGA, including all its spare resources, and does not 
require any part of the chip to be fault-free. In this section, we 
discuss the application of our delay-fault BIST to roving 
STARs for delay fault detection. 

Figure 5 depicts an FPGA with a vertical STAR (V-STAR)
and a horizontal STAR (H-STAR); the system application
resides in the working areas denoted by the white boxes
representing working PLBs. Note that global horizontal
routing resources in V-STAR and global vertical routing
resources in H-STAR may be used by the system signals
connecting the working areas separated by the STARs.
Partial RTR via the boundary-scan interface allows the test
configurations used by STARs to be downloaded without
impacting the system operation. After self-testing of a STAR
has been completed (both for PLBs and interconnect), the
STAR roves to a new location by exchanging places with an
equal-size slice of the working area; roving the STARs across
the FPGA is implemented by a sequence of precomputed partial
reconfigurations and assures that the entire FPGA will
eventually be tested. The roving process and the use of roving
STARs for test and diagnosis of PLBs are described in detail
in [1] and [3].
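
The coverage guarantee of the roving process can be illustrated with a simple schedule. The sketch below is not the roving algorithm of [1] and [3]; it only assumes a V-STAR of fixed width that moves across the columns of the array in steps of its own width, and the array size and STAR width used are hypothetical.

```python
def vstar_positions(num_columns, star_width):
    """Left-edge column index of each successive V-STAR position, chosen so
    that every column of the array falls inside the STAR at least once."""
    positions = list(range(0, num_columns - star_width + 1, star_width))
    if positions[-1] + star_width < num_columns:
        # Array width is not a multiple of the STAR width: add a final,
        # partially overlapping position to reach the last columns.
        positions.append(num_columns - star_width)
    return positions


def covered_columns(num_columns, star_width):
    """Set of columns tested after one full sweep of the V-STAR."""
    covered = set()
    for left in vstar_positions(num_columns, star_width):
        covered.update(range(left, left + star_width))
    return covered


# Hypothetical 20-column array and 2-column-wide V-STAR.
sweep = vstar_positions(20, 2)
```

Each entry of the schedule corresponds to one precomputed partial reconfiguration that exchanges the STAR with the adjacent slice of the working area; an H-STAR can be handled in the same way over rows.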

Testing for delay faults follows the pattern of interconnect
testing in our on-line routing BIST [11], where horizontal and
vertical routing resources are tested in H-STAR and V-STAR,
respectively, as illustrated in Figure 6a. Delay faults in
the PLBs can also be tested as part of the interconnect
delay-fault testing sequence. As a result, no additional test phases
are required for the PLB logic, since these are included when
local interconnect delay-fault testing is performed. Testing
for delay faults in the cross-points connecting global
horizontal and vertical routing busses must involve both STARs
and can only be performed at the intersection of the two
STARs, as illustrated in Figure 6b. Table 1 summarizes the
delay-fault test of a Lattice ORCA 2C series FPGA in terms
of the number of test phases that must be executed in each
STAR position.

Figure 5. STARs in FPGA

a) horizontal & vertical STARs
b) delay fault testing of global-to-global cross-points
Figure 6. On-line delay fault testing using STARs

One way of characterizing the difference between on-line
and off-line testing is that no system function exists during
off-line (manufacturing) testing. Hence for off-line testing,
we can populate the entire FPGA with a "galaxy" of parallel
STARs (either vertical or horizontal), all concurrently
executing the same delay-fault BIST, as illustrated in Figure 7.
The entries in Table 1 represent the complete set of delay-fault
configurations. This set of BIST configurations is the
same for both on-line and off-line testing. This is because
off-line testing for interconnect is best served by the galaxy
testing approach and because delay-fault testing of the PLB
resources (LUTs and latches) can be performed in conjunction
with the delay-fault testing of local interconnect
resources.

Figure 7. Galaxy testing

Table 1. Summary of delay-fault BIST test sessions

Resources under test                       Test phases
global routing                             3
multiplexer CIPs & PLB logic               7
cross-point CIPs between global busses     4


5. Conclusions 

We have presented the first delay-fault testing approach
for FPGAs. The approach is applicable both for off-line
manufacturing and system-level testing and for on-line
testing with our roving STARs approach for fault tolerance.
Our approach is based on BIST, it is comprehensive, and it does
not require expensive ATE. We have successfully implemented
this BIST approach on the ORCA 2C series FPGA
and have verified that the approach is not only feasible but
also practical. Once delay faults have been detected, the next
step for research and development is diagnosis of the delay
fault, both for off-line yield enhancement and for fault
tolerance for delay faults; this is one of our areas of focus for
the future.
Delay Fault Testing with Both Transitions



Chuck Stroud 



The current implementation that Matt did used TEN for the test pattern, which allowed the
low-to-high transition for the delay fault testing. However, you cannot do the high-to-low
transition test, since turning off TEN also shuts off the Boundary Scan interface. So
to do both transitions you need to have a TPG for the generation of the test signal, and the
test sequence needs to be coordinated with the size of the counters used for the delay fault
test (including the comparison of the wires under test as well as the test for a "fast" signal).
Therefore, the following test arrangement should work. Given that the maximum expected
count value for the "fast" signal test is F and for the regular test is T, then the size of the
TPG counter in terms of the number of bits N should be:

The counter needs to be larger to provide the control needed to enable the counts only during
the appropriate transition test sequence and then to hold the values in the counters after
each transition test sequence is complete. Before taking TEN back low, the count values
need to be shifted out of the Boundary Scan interface. Of course this will take some
additional logic to control the transfer of the count data from the reference clock (from the
on-board oscillator or a high-frequency external clock) to TCK. The control logic to the
counters and the logic to transfer the data for scanning out on TCK is not shown in the figure.